
    Correcting discount-factor mismatch in on-policy policy gradient methods

    The policy gradient theorem gives a convenient form of the policy gradient in terms of three factors: an action value, a gradient of the action likelihood, and a state distribution involving discounting called the discounted stationary distribution. But commonly used on-policy methods based on the policy gradient theorem ignore the discount factor in the state distribution, which is technically incorrect and may even cause degenerate learning behavior in some environments. An existing solution corrects this discrepancy by using $\gamma^t$ as a factor in the gradient estimate. However, this solution is not widely adopted and does not work well in tasks where the later states are similar to earlier states. We introduce a novel distribution correction to account for the discounted stationary distribution that can be plugged into many existing gradient estimators. Our correction circumvents the performance degradation associated with the $\gamma^t$ correction and has lower variance. Importantly, compared to the uncorrected estimators, our algorithm provides improved state emphasis to evade suboptimal policies in certain environments and consistently matches or exceeds the original performance on several OpenAI Gym and DeepMind suite benchmarks.
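
    As a rough illustration of the existing $\gamma^t$ correction mentioned in the abstract (not the novel correction the paper proposes, which the abstract does not specify), the following minimal sketch computes the per-step coefficients that would multiply the score term in a REINFORCE-style estimate. The function names `discounted_returns` and `score_weights` are hypothetical and chosen only for this example.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Return G_t = sum over k >= t of gamma^(k - t) * r_k for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Accumulate backwards so each G_t reuses G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def score_weights(rewards: List[float], gamma: float, use_gamma_t: bool) -> List[float]:
    """Coefficient applied to grad log pi(a_t | s_t) at each step t.

    use_gamma_t=False: the common estimator that ignores discounting
                       in the state distribution (weight is G_t).
    use_gamma_t=True:  the gamma^t-corrected estimator (weight is gamma^t * G_t).
    """
    returns = discounted_returns(rewards, gamma)
    if not use_gamma_t:
        return returns
    return [(gamma ** t) * g for t, g in enumerate(returns)]


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]  # made-up episode, for illustration only
    print(score_weights(rewards, gamma=0.5, use_gamma_t=False))
    print(score_weights(rewards, gamma=0.5, use_gamma_t=True))
```

    With the corrected weights, later steps contribute geometrically less to the gradient, which is consistent with the abstract's remark that the $\gamma^t$ factor can hurt in tasks where later states resemble earlier ones.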

    Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers

    Real-time learning is crucial for robotic agents adapting to ever-changing, non-stationary environments. A common setup for a robotic agent is to have two different computers simultaneously: a resource-limited local computer tethered to the robot and a powerful remote computer connected wirelessly. Given such a setup, it is unclear to what extent the performance of a learning system can be affected by resource limitations and how to efficiently use the wirelessly connected powerful computer to compensate for any performance loss. In this paper, we implement a real-time learning system called the Remote-Local Distributed (ReLoD) system to distribute computations of two deep reinforcement learning (RL) algorithms, Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO), between a local and a remote computer. The performance of the system is evaluated on two vision-based control tasks developed using a robotic arm and a mobile robot. Our results show that SAC's performance degrades heavily on a resource-limited local computer. Strikingly, when all computations of the learning system are deployed on a remote workstation, SAC fails to compensate for the performance loss, indicating that, without careful consideration, using a powerful remote computer may not result in performance improvement. However, a carefully chosen distribution of computations of SAC consistently and substantially improves its performance on both tasks. On the other hand, the performance of PPO remains largely unaffected by the distribution of computations. In addition, when all computations happen solely on a powerful tethered computer, the performance of our system remains on par with an existing system that is well-tuned for using a single machine. ReLoD is the only publicly available system for real-time RL that applies to multiple robots for vision-based tasks. Comment: Appears in Proceedings of the 2023 International Conference on Robotics and Automation (ICRA).
Source code at https://github.com/rlai-lab/relod and companion video at https://youtu.be/7iZKryi1xS

    Association of metabolic dysregulation with volumetric brain magnetic resonance imaging and cognitive markers of subclinical brain aging in middle-aged adults: the Framingham Offspring Study.

    Objective: Diabetic and prediabetic states, including insulin resistance, fasting hyperglycemia, and hyperinsulinemia, are associated with metabolic dysregulation. These components have been individually linked to increased risks of cognitive decline and Alzheimer's disease. We aimed to comprehensively relate all of the components of metabolic dysregulation to cognitive function and brain magnetic resonance imaging (MRI) in middle-aged adults. Research design and methods: Framingham Offspring participants who underwent volumetric MRI and detailed cognitive testing and were free of clinical stroke and dementia during examination 7 (1998-2001) constituted our study sample (n = 2,439; 1,311 women; age 61 ± 9 years). We related diabetes, homeostasis model assessment of insulin resistance (HOMA-IR), fasting insulin, and glycohemoglobin levels to cross-sectional MRI measures of total cerebral brain volume (TCBV) and hippocampal volume and to verbal and visuospatial memory and executive function. We serially adjusted for age, sex, and education alone (model A), additionally for other vascular risk factors (model B), and finally, with the inclusion of apolipoprotein E-ε4, plasma homocysteine, C-reactive protein, and interleukin-6 (model C). Results: We observed an inverse association between all indices of metabolic dysfunction and TCBV in all models (P < 0.030). The observed difference in TCBV between participants with and without diabetes was equivalent to approximately 6 years of chronologic aging. Diabetes and elevated glycohemoglobin, HOMA-IR, and fasting insulin were related to poorer executive function scores (P < 0.038), whereas only HOMA-IR and fasting insulin were inversely related to visuospatial memory (P < 0.007). Conclusions: Metabolic dysregulation, especially insulin resistance, was associated with lower brain volumes and executive function in a large, relatively healthy, middle-aged, community-based cohort.

    Assessment of the outcomes of open side-to-side choledochoduodenostomy in the management of choledocholithiasis

    Background: Gallstone disease is one of the most common digestive diseases leading to frequent hospital visits, and its prevalence shows ethnic variability, with rates of approximately 10-15% in the United States and Europe. The present study aims to prospectively assess the outcomes of open side-to-side choledochoduodenostomy in the management of choledocholithiasis. Methods: This hospital-based prospective observational study was conducted in the Department of Surgery, Tezpur Medical College and Hospital, Tezpur, over a one-year period, from July 2021 to June 2022. The study included twenty-four patients admitted to the surgery department for bile duct stone operations. After intraoperative confirmation of the criteria, these patients underwent choledochoduodenostomy. The patients were followed for 2 months after discharge. Results: Only a few patients had immediate postoperative complications, which were managed conservatively. No patient had any evidence of retained stones, symptoms of cholangitis, features suggestive of the development of sump syndrome, or any other postoperative complications during follow-up. Conclusion: Open side-to-side choledochoduodenostomy should be considered a method of choice in remote areas where endoscopic facilities are lacking and in patients where cost is a factor in deciding the choice of procedure, with reduced postoperative complications like retained stones and a shorter duration of hospital stay in expert surgical hands.